- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
Positive concave deep equilibrium models
Gabor, Mateusz, Piotrowski, Tomasz, Cavalcante, Renato L. G.
Deep equilibrium (DEQ) models are widely recognized as a memory-efficient alternative to standard neural networks, achieving state-of-the-art performance in language modeling and computer vision tasks. These models solve a fixed point equation instead of explicitly computing the output, which sets them apart from standard neural networks. However, existing DEQ models often lack formal guarantees of the existence and uniqueness of the fixed point, and the convergence of the numerical scheme used for computing the fixed point is not formally established. As a result, DEQ models are potentially unstable in practice. To address these drawbacks, we introduce a novel class of DEQ models called positive concave deep equilibrium (pcDEQ) models. Our approach, which is based on nonlinear Perron-Frobenius theory, enforces nonnegative weights and activation functions that are concave on the positive orthant. By imposing these constraints, we can easily ensure the existence and uniqueness of the fixed point without relying on additional complex assumptions commonly found in the DEQ literature, such as those based on monotone operator theory in convex analysis. Furthermore, the fixed point can be computed with the standard fixed point algorithm, and we provide theoretical guarantees of geometric convergence, which, in particular, simplifies the training process. Experiments demonstrate the competitiveness of our pcDEQ models against other implicit models.
- Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
- Asia > Japan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
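The two structural constraints in the pcDEQ abstract — nonnegative weights and an activation that is concave on the positive orthant — can be illustrated with a toy fixed-point iteration. The sketch below is a minimal NumPy illustration (the layer sizes, sqrt activation, and random weights are assumptions for demonstration, not the paper's architecture); the point is that the standard fixed-point algorithm reaches the same unique fixed point from very different starting vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# The two pcDEQ constraints from the abstract, in toy form: nonnegative
# weights and an activation (sqrt) that is concave on the positive orthant.
# Sizes and the sqrt choice are illustrative, not the paper's model.
W = rng.uniform(0.0, 1.0, (4, 4))
b = rng.uniform(0.1, 1.0, 4)

def layer(z):
    """A positive concave mapping on the positive orthant."""
    return W @ np.sqrt(z) + b

def fixed_point(z0, tol=1e-10, max_iter=200):
    """Standard fixed-point iteration z <- layer(z)."""
    z = z0
    for _ in range(max_iter):
        z_next = layer(z)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

# Two very different initializations reach the same unique fixed point.
z_a = fixed_point(np.full(4, 0.5))
z_b = fixed_point(np.full(4, 50.0))
print(np.allclose(z_a, z_b, atol=1e-6))   # True
```

The geometric convergence claimed in the abstract shows up here as the residual shrinking by a roughly constant factor per iteration.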
Fixed Point Diffusion Models
Bai, Xingjian, Melas-Kyriazi, Luke
We introduce the Fixed Point Diffusion Model (FPDM), a novel approach to image generation that integrates the concept of fixed point solving into the framework of diffusion-based generative modeling. Our approach embeds an implicit fixed point solving layer into the denoising network of a diffusion model, transforming the diffusion process into a sequence of closely-related fixed point problems. Combined with a new stochastic training method, this approach significantly reduces model size, reduces memory usage, and accelerates training. Moreover, it enables the development of two new techniques to improve sampling efficiency: reallocating computation across timesteps and reusing fixed point solutions between timesteps. We conduct extensive experiments with state-of-the-art models on ImageNet, FFHQ, CelebA-HQ, and LSUN-Church, demonstrating substantial improvements in performance and efficiency. Compared to the state-of-the-art DiT model, FPDM contains 87% fewer parameters, consumes 60% less memory during training, and improves image generation quality in situations where sampling computation or time is limited. Our code and pretrained models are available at https://lukemelas.github.io/fixed-point-diffusion-models.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > Netherlands > South Holland > Dordrecht (0.04)
- (7 more...)
- Research Report > Promising Solution (0.54)
- Overview > Innovation (0.34)
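The second sampling technique in the FPDM abstract — reusing fixed point solutions between timesteps — amounts to warm-starting each timestep's solver at the previous timestep's solution. A toy sketch of why that helps (the contractive maps below are invented stand-ins for the denoiser's fixed-point problems, chosen so that consecutive solutions drift slowly):

```python
import numpy as np

# Toy per-timestep fixed-point problems z = f(z, t): contractive maps whose
# solutions drift slowly with t, mimicking the "closely-related fixed point
# problems" of the abstract. The maps are invented, not the FPDM denoiser.
def f(z, t):
    return 0.5 * z + np.array([np.sin(0.1 * t), np.cos(0.1 * t)])

def solve(z, t, tol=1e-8):
    """Fixed-point iteration; returns the solution and the iteration count."""
    iters = 0
    while np.linalg.norm(f(z, t) - z) > tol:
        z = f(z, t)
        iters += 1
    return z, iters

cold_total, warm_total = 0, 0
z_warm = np.zeros(2)
for t in range(10, 0, -1):                   # reverse (denoising) order
    _, n_cold = solve(np.zeros(2), t)        # cold start at every timestep
    z_warm, n_warm = solve(z_warm, t)        # reuse the previous solution
    cold_total += n_cold
    warm_total += n_warm
print(warm_total < cold_total)   # True: warm starts need fewer iterations
```

Because each map is `z = 0.5*z + c(t)`, its exact solution is `2*c(t)`, so a warm start begins much closer to the answer than the origin does.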
Probabilistic Control and Majorization of Optimal Control
Probabilistic control design is founded on the principle that a rational agent attempts to match the modelled closed-loop system trajectory density with an arbitrary desired one. The framework was originally proposed as a tractable alternative to traditional optimal control design, parametrizing desired behaviour through fictitious transition and policy densities and using the information projection as a proximity measure. In this work we introduce an alternative parametrization of desired closed-loop behaviour and explore alternative proximity measures between densities. It is then illustrated how the associated probabilistic control problems resolve into uncertain or probabilistic policies. Our main result shows that the probabilistic control objectives majorize conventional stochastic and risk-sensitive optimal control objectives. This observation allows us to identify two probabilistic fixed point iterations that converge to the deterministic optimal control policies, establishing an explicit connection between the two formulations. Further, we demonstrate that the risk-sensitive optimal control formulation is technically equivalent to a Maximum Likelihood estimation problem on a probabilistic graphical model, where the notion of cost is directly encoded into the model. The associated treatment of the estimation problem is then shown to coincide with the moment-projected probabilistic control formulation. In this way, optimal decision making can be reformulated as an iterative inference problem. Based on these insights, we discuss directions for algorithmic development.
- Europe > Belgium > Flanders > East Flanders > Ghent (0.04)
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Control Systems (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
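A familiar concrete instance of a fixed point iteration converging to a deterministic optimal control policy is classical value iteration, where the Bellman optimality operator is contracted to its fixed point. The sketch below uses a made-up two-state, two-action MDP for illustration (it is not the paper's probabilistic formulation, just the conventional objective that the paper's iterations are shown to connect to):

```python
import numpy as np

# A two-state, two-action MDP (all numbers made up for illustration).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],     # P[s, a, s']: transition probs
              [[0.7, 0.3], [0.05, 0.95]]])
R = np.array([[1.0, 0.0], [0.5, 2.0]])      # R[s, a]: immediate rewards
gamma = 0.9

def bellman(V):
    """Bellman optimality operator T(V); a gamma-contraction."""
    return np.max(R + gamma * P @ V, axis=1)

V = np.zeros(2)
for _ in range(500):        # fixed-point iteration V <- T(V)
    V = bellman(V)

# The fixed point yields the deterministic optimal policy by greedy selection.
policy = np.argmax(R + gamma * P @ V, axis=1)
```

Since T is a gamma-contraction, the iterates converge geometrically and the greedy policy read off the fixed point is optimal.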
TorchDEQ: A Library for Deep Equilibrium Models
Geng, Zhengyang, Kolter, J. Zico
Deep Equilibrium (DEQ) Models, an emerging class of implicit models that maps inputs to fixed points of neural networks, are of growing interest in the deep learning community. However, training and applying DEQ models is currently done in an ad-hoc fashion, with various techniques spread across the literature. In this work, we systematically revisit DEQs and present TorchDEQ, an out-of-the-box PyTorch-based library that allows users to define, train, and infer using DEQs over multiple domains with minimal code and best practices. Using TorchDEQ, we build a "DEQ Zoo" that supports six published implicit models across different domains. By developing a joint framework that incorporates the best practices across all models, we have substantially improved the performance, training stability, and efficiency of DEQs on ten datasets across all six projects in the DEQ Zoo. TorchDEQ and DEQ Zoo are released as open source.
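As a reference point for what such a library automates, a DEQ layer maps an input x to a fixed point z* of a weight-tied cell. Below is a bare NumPy sketch of just the forward pass (random toy weights, naive iteration — this is not the TorchDEQ API, which additionally provides solvers, training utilities, and implicit differentiation):

```python
import numpy as np

rng = np.random.default_rng(1)
# Weight-tied cell z = tanh(W z + U x + b); the small factor on W keeps the
# map contractive so naive iteration converges. All shapes and weights are
# toy assumptions for illustration.
W = 0.15 * rng.standard_normal((3, 3))
U = rng.standard_normal((3, 2))
b = rng.standard_normal(3)

def deq_forward(x, iters=150):
    """Map input x to the fixed point z* of the weight-tied cell."""
    z = np.zeros(3)
    for _ in range(iters):
        z = np.tanh(W @ z + U @ x + b)
    return z

x = np.array([0.5, -1.0])
z_star = deq_forward(x)
# z_star satisfies z = tanh(W z + U x + b) up to numerical tolerance.
```

The "infinite-depth" interpretation is visible here: the same cell is applied until it stops changing, instead of stacking distinct layers.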
Deep Equilibrium Models Meet Federated Learning
Gkillas, Alexandros, Ampeliotis, Dimitris, Berberidis, Kostas
In this study the problem of Federated Learning (FL) is explored from a new perspective by utilizing Deep Equilibrium (DEQ) models instead of conventional deep learning networks. We claim that incorporating DEQ models into the federated learning framework naturally addresses several open problems in FL, such as the communication overhead due to sharing large models and the ability to incorporate heterogeneous edge devices with significantly different computation capabilities. Additionally, a weighted average fusion rule is proposed at the server side of the FL framework to account for the different qualities of models from heterogeneous edge devices. To the best of our knowledge, this study is the first to establish a connection between DEQ models and federated learning, contributing to the development of an efficient and effective FL framework. Finally, promising initial experimental results are presented, demonstrating the potential of this approach in addressing challenges of FL.
- South America > Peru > Loreto Department (0.04)
- North America > Mexico > Gulf of Mexico (0.04)
- Europe > Middle East > Cyprus (0.04)
- Europe > Greece > West Greece > Patra (0.04)
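The server-side weighted average fusion rule described in the abstract reduces to normalizing per-client quality scores and averaging parameters with them. A minimal sketch (the parameter vectors and quality scores below are placeholders — the abstract does not specify the paper's actual quality metric):

```python
import numpy as np

# Per-client parameter vectors and quality scores (all values placeholders;
# the quality metric used by the paper is not specified in the abstract).
client_params = [np.array([1.0, 2.0]),
                 np.array([3.0, 4.0]),
                 np.array([5.0, 6.0])]
quality = np.array([0.5, 0.3, 0.2])

# Normalize the scores so the fused model is a convex combination.
w = quality / quality.sum()
fused = sum(wi * p for wi, p in zip(w, client_params))
print(fused)   # [2.4 3.4]
```

With uniform scores this collapses to plain FedAvg-style averaging; the weighting only matters when clients' model qualities differ.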
Layer-Neighbor Sampling -- Defusing Neighborhood Explosion in GNNs
Balın, Muhammed Fatih, Çatalyürek, Ümit V.
Graph Neural Networks (GNNs) have received significant attention recently, but training them at a large scale remains a challenge. Mini-batch training coupled with sampling is used to alleviate this challenge. However, existing approaches either suffer from the neighborhood explosion phenomenon or have poor performance. To address these issues, we propose a new sampling algorithm called LAyer-neighBOR sampling (LABOR). It is designed to be a direct replacement for Neighbor Sampling (NS) with the same fanout hyperparameter while sampling up to 7 times fewer vertices, without sacrificing quality. By design, the variance of the estimator of each vertex matches NS from the point of view of a single vertex. Moreover, under the same vertex sampling budget constraints, LABOR converges faster than existing layer sampling approaches and can use up to 112 times larger batch sizes compared to NS.
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
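For contrast with LABOR, here is plain Neighbor Sampling (NS) with a fanout hyperparameter on a toy graph (the adjacency lists are invented for illustration). Each layer samples up to `fanout` neighbors of every frontier vertex, which is what lets the sampled neighborhood grow like fanout**depth — the explosion that layer-sampling methods such as LABOR are designed to defuse:

```python
import random

# Toy directed graph as adjacency lists (invented for illustration).
adj = {0: [1, 2, 3, 4], 1: [2, 5], 2: [0, 5, 6],
       3: [], 4: [0, 6], 5: [1], 6: [2]}

def neighbor_sample(seeds, fanout, num_layers, rng):
    """Plain NS: per layer, sample up to `fanout` neighbors of each vertex."""
    layers = [set(seeds)]
    frontier = set(seeds)
    for _ in range(num_layers):
        nxt = set()
        for v in frontier:
            nbrs = adj[v]
            nxt.update(rng.sample(nbrs, min(fanout, len(nbrs))))
        layers.append(nxt)
        frontier = nxt
    return layers

rng = random.Random(0)
layers = neighbor_sample([0], fanout=2, num_layers=2, rng=rng)
# Frontier size can grow roughly like fanout**depth with vanilla NS.
```

LABOR keeps the same fanout interface while correlating samples across the layer so fewer distinct vertices are drawn in total.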